Disambiguation of Compound Noun Translations Extracted from Bilingual Comparable Corpora
Author
Abstract
Machine-readable bilingual dictionaries are indispensable information resources for cross-language information retrieval, machine translation, and related applications. In this paper, we describe a bilingual dictionary acquisition system that extracts translations from non-parallel but comparable corpora of a specific academic domain and disambiguates the extracted translations. We also evaluate the proposed method experimentally.
Similar resources
Disambiguation of Single Noun Translations Extracted from Bilingual Comparable Corpora
s of papers of four academic societies, namely Japan Architecture Society (JAS), Institute of Electric Engineering (IEE), Institute of Electronics and Communication Engineering (IECE), and Information Processing Society of Japan (IPSJ), published in Japan. Numbers of abstracts of each of these corpora are shown in Table 1. Parts of these bilingual corpora are parallel. The percentages of parall...
Disambiguation of Lexical Translations Based on Bilingual Comparable Corpora
Machine-readable bilingual dictionaries are important and indispensable information resources for cross-language information retrieval (CLIR), machine translation (MT), and so on. Specific academic areas or technology fields have become the focus of these cross-language informational activities. In this paper, we describe a bilingual dictionary acquisition system which extracts translations from ...
Utilizing Clues in Syntactic Relationship for Automatic Target Word Sense Disambiguation
Multiple translations to the target language are due to several meanings of source words and various target word equivalents, depending on the context of the source word. Thus, an automated approach is presented for resolving target-word selection, based on “word-to-sense” and “sense-to-word” relationship between source words and their translations, using syntactic relationships (subject-verb, ...
Exploiting Parallel Corpora for Supervised Word-Sense Disambiguation in English-Hungarian Machine Translation
In this paper we present an experiment to automatically generate annotated training corpora for a supervised word sense disambiguation module operating in an English-Hungarian and a Hungarian-English machine translation system. Training examples for the WSD module are produced by annotating ambiguous lexical items in the source language (words having several possible translations) with their pr...
Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval
The present paper will seek to present an approach to bilingual lexicon extraction from non-aligned comparable corpora, phrasal translation as well as evaluations on Cross-Language Information Retrieval. A two-stage translation model is proposed for the acquisition of bilingual terminology from comparable corpora, disambiguation and selection of best translation alternatives according to their...